

Review for NeurIPS paper: Bayesian filtering unifies adaptive and non-adaptive neural network optimization methods

Neural Information Processing Systems

Summary and Contributions: Post-rebuttal: Dear authors, thank you for your detailed response and for offering to fix many of the points we raised. I would like to sum up my thoughts after having read the other reviews and your rebuttal. On a high level, the following aspects were most significant in how I arrived at my final score: 1) The perspective is novel, and has interesting potential. Re 1: I think we all agree that this is a pro for the paper and should be considered its main strength. Re 2: Questioning the approximations is a valid point. However, as you argue, you provided sufficient empirical evidence for the mini-batch Gaussianity, and I think that Gaussianity is often assumed without further justification in other Bayesian inference applications as well, simply to keep the computations tractable. Even if the assumptions are not fully realistic, they seem to be "less concerning than those in past work" (rebuttal, line 19).


Review for NeurIPS paper: Bayesian filtering unifies adaptive and non-adaptive neural network optimization methods

Neural Information Processing Systems

After a discussion with the reviewers, I converged towards recommending acceptance of this submission. The reviewers raised the following aspects: 1) The perspective is novel, and has interesting potential. Re 1: All reviewers agree that this is a pro for the paper and should be considered its main strength. The authors agree (rebuttal, lines 23-25). Re 2: R3 believes that questioning the approximations is a valid point. However, as the authors argue, they have provided sufficient empirical evidence for mini-batch Gaussianity in appendix B, and Gaussianity is sometimes assumed without further justification in other Bayesian inference applications as well, simply to keep the computations tractable.


Bayesian filtering unifies adaptive and non-adaptive neural network optimization methods

Neural Information Processing Systems

We formulate the problem of neural network optimization as Bayesian filtering, where the observations are backpropagated gradients. While neural network optimization has previously been studied using natural gradient methods, which are closely related to Bayesian inference, those methods were unable to recover standard optimizers such as Adam and RMSprop with a root-mean-square gradient normalizer, instead yielding a mean-square normalizer. To recover the root-mean-square normalizer, we find it necessary to account for the temporal dynamics of all the other parameters as they are optimized. The resulting optimizer, AdaBayes, adaptively transitions between SGD-like and Adam-like behaviour, automatically recovers AdamW, a state-of-the-art variant of Adam with decoupled weight decay, and has generalisation performance competitive with SGD.
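The distinction the abstract draws between a root-mean-square and a mean-square gradient normalizer can be made concrete with a minimal sketch. The snippet below is not the authors' AdaBayes implementation; it simply contrasts an Adam/RMSprop-style step, which divides the gradient by sqrt(E[g^2]), with a natural-gradient-style step, which divides by E[g^2] without the square root. Function names and the toy values are illustrative assumptions.

```python
import numpy as np

def rms_normalized_step(g, v, lr=1e-3, beta2=0.999, eps=1e-8):
    """Adam/RMSprop-style step: gradient divided by a
    ROOT-mean-square normalizer, sqrt of the running mean of g^2."""
    v = beta2 * v + (1 - beta2) * g**2
    return lr * g / (np.sqrt(v) + eps), v

def mean_square_normalized_step(g, v, lr=1e-3, beta2=0.999, eps=1e-8):
    """Natural-gradient-style step: gradient divided by a
    MEAN-square normalizer, the running mean of g^2 (no square root)."""
    v = beta2 * v + (1 - beta2) * g**2
    return lr * g / (v + eps), v

# Toy illustration on a single parameter with one small gradient.
g = 0.1  # a backpropagated gradient
step_rms, _ = rms_normalized_step(g, v=0.0)
step_ms, _ = mean_square_normalized_step(g, v=0.0)
print(step_rms, step_ms)
```

Because the running mean-square is typically much smaller than 1, dividing by it without the square root produces far larger (and differently scaled) steps than the root-mean-square normalization used by Adam and RMSprop, which is why recovering the square root matters for matching standard optimizers.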